37 research outputs found

    Grid coevolution for adaptive simulations; application to the building of opening books in the game of Go

    This paper presents a successful application of parallel (grid) coevolution to the building of an opening book (OB) in 9x9 Go. Known sayings about the game of Go are rediscovered by the algorithm, and the resulting program was also able to credibly comment on openings in professional games of 9x9 Go. Interestingly, beyond the application to the game of Go, our algorithm can be seen as a "meta"-level of the UCT algorithm: "UCT applied to UCT" (instead of "UCT applied to a random player", as usual) in order to build an OB. It is generic and could be applied as well to analyzing a given situation of a Markov Decision Process.
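    The "UCT applied to UCT" idea can be sketched as a bandit loop over candidate opening moves, where each arm pull would in the full system be an evaluation by a UCT engine. This is only an illustrative sketch, not the paper's method: the toy evaluator, the move names, and all constants below are invented.

    ```python
    import math
    import random

    def ucb1(wins, visits, parent_visits, c=1.4):
        # Standard UCB1 score: empirical mean plus an exploration bonus.
        if visits == 0:
            return float("inf")
        return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

    def build_opening_book(moves, evaluate, budget=1000):
        # Repeatedly pick the opening move with the best UCB1 score and
        # refine its estimate with one more (simulated) evaluation.
        stats = {m: [0.0, 0] for m in moves}  # move -> [wins, visits]
        for t in range(1, budget + 1):
            move = max(moves, key=lambda m: ucb1(stats[m][0], stats[m][1], t))
            stats[move][0] += evaluate(move)  # 1.0 for a win, 0.0 for a loss
            stats[move][1] += 1
        # The book prefers the most-visited (most trusted) openings.
        return sorted(moves, key=lambda m: stats[m][1], reverse=True)

    # Toy evaluator standing in for a full UCT engine playing out the opening.
    random.seed(0)
    true_strength = {"3-3": 0.45, "4-4": 0.55, "5-3": 0.50}
    book = build_opening_book(list(true_strength),
                              lambda m: float(random.random() < true_strength[m]))
    ```

    In the paper's setting, `evaluate` would itself be a UCT search run on a grid, which is what makes the scheme "UCT applied to UCT".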

    Adding expert knowledge and exploration in Monte-Carlo Tree Search

    We present a new exploration term, more efficient than classical UCT-like exploration terms, that efficiently combines expert rules, patterns extracted from datasets, All-Moves-As-First values, and classical online values. As this improved bandit formula does not solve several important situations (semeais, nakade) in computer Go, we present three other important improvements which are central to the recent progress of our program MoGo:
    - We show an expert-based improvement of Monte-Carlo simulations for nakade situations; we also emphasize some limitations of this modification.
    - We show a technique which preserves diversity in the Monte-Carlo simulation, which greatly improves the results in 19x19.
    - Whereas the UCB-based exploration term is not efficient in MoGo, we show a new exploration term which is highly efficient in MoGo.
    MoGo recently won a game with handicap 7 against a 9-Dan professional player, Zhou JunXun, winner of the LG Cup 2007, and a game with handicap 6 against a 1-Dan professional player, Li-Chen Chien.
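    A minimal sketch of how a bandit value can combine these sources of information. This is not MoGo's actual formula: it uses one published RAVE mixing schedule, beta = sqrt(k / (3n + k)), treats the expert/pattern prior as pseudo-counts, and all parameter values are illustrative.

    ```python
    import math

    def blended_value(online_wins, online_visits,
                      rave_wins, rave_visits,
                      prior_mean, prior_visits=10.0,
                      k=1000.0):
        # Expert/pattern knowledge enters as pseudo-counts on the online
        # statistics; the RAVE (AMAF) estimate is mixed in with a weight
        # beta that shrinks as real visits accumulate.
        n = online_visits + prior_visits
        q_online = (online_wins + prior_mean * prior_visits) / n
        q_rave = rave_wins / rave_visits if rave_visits else prior_mean
        beta = math.sqrt(k / (3.0 * n + k))
        return (1.0 - beta) * q_online + beta * q_rave

    # With few online visits the RAVE estimate (0.6 here) dominates...
    early = blended_value(1, 2, 60, 100, prior_mean=0.5)
    # ...with many visits the online mean (~0.40 here) pulls the value down.
    late = blended_value(400, 1000, 60, 100, prior_mean=0.5)
    ```

    The exploration bonus described in the paper would then be added on top of this exploitation value inside the tree-search selection rule.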

    Combining expert, offline, transient and online knowledge for Monte-Carlo tree search

    We combine, for Monte-Carlo tree search, machine learning at four different time scales:
    - online regret, through the use of bandit algorithms and Monte-Carlo estimates;
    - transient learning, through the use of rapid action value estimates (RAVE), which are learnt online and used to accelerate the exploration, then gradually set aside as finer information becomes available;
    - offline learning, by data mining of game datasets;
    - expert knowledge used as prior information.
    The resulting algorithm is stronger than each element taken separately. We also exhibit an exploration-exploitation dilemma in Monte-Carlo tree search and obtain a very strong improvement by tuning the corresponding parameters.
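    The parameter-tuning step mentioned at the end can be illustrated with a toy sweep over an exploration constant, keeping the value with the best empirical win rate. The quadratic win-rate model and every constant below are invented for illustration; real tuning would measure win rates in self-play or against a reference opponent.

    ```python
    import random

    def win_rate(c, rng, games=3000):
        # Stand-in for self-play results: pretend strength peaks near c = 0.3.
        p = 0.5 - (c - 0.3) ** 2
        return sum(rng.random() < p for _ in range(games)) / games

    rng = random.Random(1)
    candidates = [0.0, 0.1, 0.3, 0.7, 1.4]
    best_c = max(candidates, key=lambda c: win_rate(c, rng))
    ```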

    The Computational Intelligence of MoGo Revealed in Taiwan's Computer Go Tournaments

    The authors are extremely grateful to Grid5000 for helping in designing and experimenting around Monte-Carlo Tree Search. In order to promote computer Go and stimulate further development and research in the field, the event activities "Computational Intelligence Forum" and "World 9x9 Computer Go Championship" were held in Taiwan. This study focuses on the invited games played in the tournament "Taiwanese Go players versus the computer program MoGo", held at National University of Tainan (NUTN). Several Taiwanese Go players, including one 9-Dan professional Go player and eight amateur Go players, were invited by NUTN to play against MoGo from August 26 to October 4, 2008. The MoGo program combines All-Moves-As-First (AMAF)/Rapid Action Value Estimation (RAVE) values, online "UCT-like" values, offline values extracted from databases, and expert rules. Additionally, four properties of MoGo are analyzed: (1) the weakness in corners, (2) the scaling over time, (3) the behavior in handicap games, and (4) the main strength of MoGo in contact fights. The results reveal that MoGo can reach the level of 3 Dan with (1) good skills for fights, (2) weaknesses in corners, in particular for "semeai" situations, and (3) weaknesses in favorable situations such as handicap games. It is hoped that advances in artificial intelligence and computational power will enable considerable progress in the field of computer Go, with the aim of reaching the same level as computer chess or Chinese chess in the future.

    Monte-Carlo Tree Search in Backgammon

    Monte-Carlo Tree Search is a new method which has been applied successfully to many games. However, it has never been tested on two-player perfect-information games with a chance factor. Backgammon is the reference game of this category. Today's best Backgammon programs are based on reinforcement learning and are stronger than the best human players. These programs have played millions of offline games to learn to evaluate a position. Our approach consists rather in playing online simulated games to learn how to play correctly in the current position.
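    The online-simulation idea can be sketched on a toy dice game standing in for Backgammon (the race-to-10 rules and all names below are invented): evaluate each legal move in the current position by averaging random playouts, re-rolling the dice in every playout, then play the move with the best mean outcome.

    ```python
    import random

    def playout(my_score, opp_score, rng, target=10):
        # Alternate random dice rolls until one side reaches the target.
        me_to_move = True
        while my_score < target and opp_score < target:
            roll = rng.randint(1, 6)
            if me_to_move:
                my_score += roll
            else:
                opp_score += roll
            me_to_move = not me_to_move
        return 1.0 if my_score >= target else 0.0

    def choose_move(moves, my_score, opp_score, rng, n=500):
        # Each candidate "move" banks some extra points before the playouts;
        # the chance factor (dice) is re-sampled inside every simulation.
        def value(m):
            return sum(playout(my_score + m, opp_score, rng) for _ in range(n)) / n
        return max(moves, key=value)

    rng = random.Random(42)
    best = choose_move([0, 1, 3], my_score=4, opp_score=4, rng=rng)
    ```

    A full MCTS version would grow a tree whose chance nodes branch on the dice roll, but the core of the approach described above is exactly this: simulate from the current position instead of relying on an offline-learned evaluation.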

    Monte-Carlo Tree Search
